Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Immediate remote cancels #245

Merged
merged 25 commits into from
Oct 17, 2021
Merged

Immediate remote cancels #245

merged 25 commits into from
Oct 17, 2021

Conversation

goodboy
Copy link
Owner

@goodboy goodboy commented Oct 13, 2021

This adds greedy remote cancellation to the msg loop making both task and actor-runtime cancel requests more immediate in the sense they are no longer scheduled for run with .start() in the service nursery and instead are _invoke()d directly and served asap.

These are pieces from an ongoing rework of the actor nursery going on over in https://github.com/goodboy/tractor/tree/zombie_lord_infinite, which is basically moving away entirely from the original multiplexed error handling style to per-task-managing-spawned-process style. The idea here is to make the actual nursery implementation much simpler whilst keeping the harder stuff (like was this actor-task cancelled, and do we care?) is done in the spawning back-end on a per task level thereby minimizing global state in supervisor strategy implementations.

Mostly putting this up independent of the spawning/nursery rework stuff to see how CI responds to the timing changes in the core msg loop.

@goodboy goodboy added supervision cancellation SC teardown semantics and anti-zombie semantics labels Oct 13, 2021
@goodboy
Copy link
Owner Author

goodboy commented Oct 13, 2021

Lul and so the bloodbath begins.

Base automatically changed from less_logging to master October 14, 2021 17:37
As for `Actor.cancel()` requests, do the same for
`Actor._cancel_task()` but use `_invoke()` to ensure
correct msg transactions with caller. Don't cancel task
cancels on a cancel-all-tasks operation in attempt at
more determinism.
This is actually surprisingly easy to grok having gone through a lot of
pain understanding edge cases in the zombie lord dev branch. Basically
we just need to make sure actors are managed in a 2 step reap sequence.
In the "soft" reap phase we wait for the process to terminate on its own
concurrently with (maybe) waiting for its portal's final result (if it's
a `.run_in_actor()`). If this path is cancelled or errors, then we do
a "hard" reap where we timeout and send a signal to the proc to
terminate immediately. The only last remaining trick is to tie in the
root-is-debugger-aware logic to yet again avoid tty clobbers.
With the new fixes to the trio spawner we can expect that both root
*and* depth > 1 nursery owning actors will now not clobber any children
that are in debug (either via breakpoint or through crashing). The tests
changed now include more checks which ensure the 2nd level parent-ish
actors also bubble up through into `pdb` and don't kill any of their
(crashed) children before they're done themselves debugging.
@goodboy goodboy force-pushed the immediate_remote_cancels branch from 9b3e0a5 to 51259c4 Compare October 14, 2021 17:46
@goodboy goodboy marked this pull request as ready for review October 14, 2021 20:01
@goodboy goodboy force-pushed the immediate_remote_cancels branch from 12675f4 to 4f222a5 Compare October 15, 2021 14:26
@goodboy goodboy force-pushed the immediate_remote_cancels branch 2 times, most recently from 34ed193 to 3f4384b Compare October 15, 2021 22:25
@goodboy goodboy force-pushed the immediate_remote_cancels branch from 3f4384b to b3c4851 Compare October 15, 2021 22:26
log.cancel(
f"Actor {self.uid} was remotely cancelled; "
"waiting on cancellation completion..")
await _invoke(self, cid, chan, func, kwargs, is_rpc=False)
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for reference, this is the crux of the change: instead of scheduling Actor.cancel() and ._cancel_task() as is done for rpc endpoints, we invoke these methods (which really are handlers for special messages that we should add to our SC protocol) immediately via await.

)
await self.cancel_rpc_tasks(chan)

# end of async for, channel disconnect vis ``trio.EndOfChannel``
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo via

):
'''
Connect to the root actor via a ctx and invoke a task which locks
a root-local TTY lock.
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

drop lock from end.

@goodboy goodboy merged commit 828754d into master Oct 17, 2021
@goodboy goodboy deleted the immediate_remote_cancels branch October 17, 2021 12:16
overclockworked64 added a commit to overclockworked64/tractor that referenced this pull request Oct 22, 2021
@goodboy goodboy mentioned this pull request Nov 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cancellation SC teardown semantics and anti-zombie semantics supervision
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant